NSF PAR Search | NSF Public Access Repository


Microscopic modeling of attention-based movement behaviors

https://doi.org/10.1016/j.trc.2024.104583

Li, Danrui; Schwartz, Mathew; Sohn, Samuel S; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (May 2024, Transportation Research Part C: Emerging Technologies)

Full Text Available
Toward Realistic Human Crowd Simulations with Data-Driven Parameter Space Exploration

https://doi.org/10.1109/AIxVR59861.2024.00035

Hu, Kaidong; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (January 2024, IEEE)

Full Text Available
Learning from Synthetic Human Group Activities

Chang, Che-Jui; Li, Danrui; Patel, Deep; Goes, Parth; Zhou, Honglu; Moon, Seonghyeon; Sohn, Samuel S; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (July 2024, The IEEE/CVF Conference on Computer Vision and Pattern Recognition)

The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the difficulty of obtaining large-scale labeled datasets from real-world scenarios. To address this limitation, we introduce M3Act, a synthetic data generator for multi-view multi-group multi-person human atomic actions and group activities. Powered by the Unity Engine, M3Act features multiple semantic groups, highly diverse and photorealistic images, and a comprehensive set of annotations, which facilitate the learning of human-centered tasks across single-person, multi-person, and multi-group conditions. We demonstrate the advantages of M3Act across three core experiments. The results suggest that our synthetic dataset can significantly improve the performance of several downstream methods and replace real-world datasets to reduce cost. Notably, M3Act improves the state-of-the-art MOTRv2 on the DanceTrack dataset, moving it from 10th to 2nd place on the leaderboard. Moreover, M3Act opens new research for controllable 3D group activity generation. We define multiple metrics and propose a competitive baseline for the novel task. Our code and data are available at our project page: http://cjerry1243.github.io/M3Act.
Full Text Available
MSI: Maximize Support-Set Information for Few-Shot Segmentation

Moon, Seonghyeon; Sohn, Samuel S.; Zhou, Honglu; Yoon, Sejong; Pavlovic, Vladimir; Khan, Muhammad Haris; Kapadia, Mubbasir (October 2023, International Conference on Computer Vision (ICCV))

Few-shot segmentation (FSS) aims to segment a target class using a small number of labeled images (the support set). To extract information relevant to the target class, a dominant approach in the best-performing FSS methods removes background features using a support mask. We observe that this feature excision through a limiting support mask introduces an information bottleneck in several challenging FSS cases, e.g., for small targets and/or inaccurate target boundaries. To this end, we present a novel method (MSI), which maximizes the support-set information by exploiting two complementary sources of features to generate super correlation maps. We validate the effectiveness of our approach by instantiating it into three recent and strong FSS methods. Experimental results on several publicly available FSS benchmarks show that our proposed method consistently improves performance by visible margins and leads to faster convergence. Our code and trained models are available at: https://github.com/moonsh/MSI-Maximize-Support-Set-Information
Full Text Available
The IVI Lab entry to the GENEA Challenge 2022 – A Tacotron2 Based Method for Co-Speech Gesture Generation With Locality-Constraint Attention Mechanism

https://doi.org/10.1145/3536221.3558060

Chang, Che-Jui; Zhang, Sen; Kapadia, Mubbasir (November 2022, GENEA Challenge 2022)

Full Text Available
The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents

https://doi.org/10.1145/3581641.3584045

Chang, Che-Jui; Sohn, Samuel S; Zhang, Sen; Jayashankar, Rajath; Usman, Muhammad; Kapadia, Mubbasir (March 2023, Intelligent User Interfaces 2023)

Full Text Available
D-HYPR: Harnessing Neighborhood Modeling and Asymmetry Preservation for Digraph Representation Learning

https://doi.org/10.1145/3511808.3557344

Zhou, Honglu; Chegu, Advith; Sohn, Samuel S.; Fu, Zuohui; de Melo, Gerard; Kapadia, Mubbasir (October 2022, Proceedings of the 31st ACM International Conference on Information & Knowledge Management)

Full Text Available
Disentangling audio content and emotion with adaptive instance normalization for expressive facial animation synthesis

https://doi.org/10.1002/cav.2076

Chang, Che-Jui; Zhao, Long; Zhang, Sen; Kapadia, Mubbasir (June 2022, Computer Animation and Virtual Worlds)

Full Text Available
A2X: An end-to-end framework for assessing agent and environment interactions in multimodal human trajectory prediction

https://doi.org/10.1016/j.cag.2022.05.010

Sohn, Samuel S.; Lee, Mihee; Moon, Seonghyeon; Qiao, Gang; Usman, Muhammad; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (August 2022, Computers & Graphics)

Full Text Available
Harnessing Fourier Isovists and Geodesic Interaction for Long-Term Crowd Flow Prediction

https://doi.org/10.24963/ijcai.2022/185

Sohn, Samuel S.; Moon, Seonghyeon; Zhou, Honglu; Lee, Mihee; Yoon, Sejong; Pavlovic, Vladimir; Kapadia, Mubbasir (January 2022, Thirty-First International Joint Conference on Artificial Intelligence (IJCAI))

Full Text Available
